Skip to content

Reject covariate names colliding with reserved structural terms (DifferenceInDifferences, MultiPeriodDiD, TwoWayFixedEffects)#518

Merged
igerber merged 4 commits into
mainfrom
feature/covariate-name-collision
Jun 1, 2026
Merged

Reject covariate names colliding with reserved structural terms (DifferenceInDifferences, MultiPeriodDiD, TwoWayFixedEffects)#518
igerber merged 4 commits into
mainfrom
feature/covariate-name-collision

Conversation

@igerber
Copy link
Copy Markdown
Owner

@igerber igerber commented Jun 1, 2026

Summary

  • Reject covariate names that collide with reserved structural terms in DifferenceInDifferences, MultiPeriodDiD, and TwoWayFixedEffects. These estimators build their coefficients dict by zipping a var_names list (structural terms + user covariate names appended verbatim) with the coefficient vector, so a covariate named like a structural term (const, the treatment/time columns, the {treatment}:{time} interaction, MPD period_{p} dummies, TWFE ATT, FE/unit/time dummy names, or an internal _-prefixed working column) silently overwrote that coefficient (dict last-write-wins) with no error. A shared validate_covariate_names guard now raises ValueError on such a collision (case-sensitive) and on duplicate covariate names.
  • Add a validate_design_term_names backstop on the FINAL assembled var_names (before the regression), catching fixed-effect-dummy collisions the covariate guard can't know up front — e.g. MultiPeriodDiD(fixed_effects=["period"]) whose dummies collide with the structural period_{p} event-study keys.
  • Derive reserved FE/unit/time dummy names via utils.fe_dummy_names (O(G) category-level, matching pd.get_dummies(drop_first=True) exactly incl. Categorical order) instead of materializing the dummy matrix — so the TWFE within-transform path keeps its high-cardinality scaling contract.
  • Potentially breaking: a fit that previously succeeded with a colliding/duplicated covariate name (silently returning a corrupted coefficient dict) now raises. Staggered / influence-function estimators key results by (g, t) and are unaffected; TripleDifference / SyntheticControl / SyntheticDiD do not expose covariates by name.

Methodology references (required if estimator / math changes)

  • Method name(s): DifferenceInDifferences, MultiPeriodDiD, TwoWayFixedEffects — input-validation guard on coefficient-dict labeling.
  • Paper / source link(s): N/A — correctness guard, not a methodology/estimand/SE change.
  • Any intentional deviations from the source (and why): None. Documented in docs/methodology/REGISTRY.md under the DiD/MPD/TWFE "covariate-name collision guard" Notes.

Validation

  • Tests added/updated: tests/test_utils.py (TestValidateCovariateNames, TestValidateDesignTermNames, TestFeDummyNames), tests/test_estimators.py (TestCovariateNameCollision), tests/test_estimators_vcov_type.py (TestTWFECovariateNameCollision + a within-path no-FE-dummy-materialization regression).
  • Backtest / simulation / notebook evidence (if applicable): N/A. Full default suite passes (7299 passed) on both backends; fe_dummy_names verified equal to pd.get_dummies(drop_first=True).columns across int/str/float/NaN/non-default-order Categorical.

Security / privacy

  • Confirm no secrets/PII in this PR: Yes

Generated with Claude Code

igerber and others added 4 commits June 1, 2026 10:02
…iD family)

DifferenceInDifferences, MultiPeriodDiD, and TwoWayFixedEffects build their
coefficients dict by zipping a var_names list (structural term names + user
covariate column names appended verbatim) with the fitted coefficient vector.
A covariate whose name equaled a reserved structural name (const; the
treatment/time column names; the {treatment}:{time} interaction; MPD period_{p}
dummies and {treatment}:period_{p} interactions; TWFE ATT; fixed-effect / unit /
time dummy names; or an internal _-prefixed working column such as _treat_time /
_did_treatment / _treatment_post) silently overwrote that structural coefficient
via dict last-write-wins -- e.g. a covariate named const dropped the intercept --
with no error.

Add a shared validate_covariate_names helper (utils.py) called in each fit()
before the design matrix is built: it raises ValueError on a collision
(case-sensitive, so Const is allowed) and on duplicate covariate names.
Fixed-effect/unit/time dummy reserved names are taken from the same
pd.get_dummies(..., drop_first=True) call used to build them, so they match
exactly (including for Categorical columns with a non-default category order).
The TWFE guard fires on both variance paths: the within-transform path returns
only {"ATT": att}, but a covariate named _treatment_post would still clobber the
internal interaction column.

Potentially breaking: a fit that previously succeeded with a colliding (or
duplicated) covariate name -- silently returning a corrupted coefficient dict --
now raises. The staggered / influence-function estimators key results by (g, t)
tuples / relative-time indices and are unaffected; SyntheticDiD overrides fit()
and never reaches the base-class guard.

Tests: tests/test_utils.py::TestValidateCovariateNames,
tests/test_estimators.py::TestCovariateNameCollision,
tests/test_estimators_vcov_type.py::TestTWFECovariateNameCollision.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…ions

The upfront validate_covariate_names guard only validates the user `covariates`
list. Fixed-effect dummy names (`{fe}_{value}`) are appended to var_names AFTER
that guard and were not themselves checked against structural names — so
MultiPeriodDiD(fixed_effects=["period"]) (a non-time FE whose dummies are named
period_1, period_2, ...) could still collide with the structural period_{p}
event-study keys and silently overwrite them in the coefficients dict, the exact
bug class this PR addresses (caught in local review).

Add a shared validate_design_term_names(var_names) backstop in utils.py that
enforces uniqueness on the FINAL assembled term list (structural + covariates +
fixed-effect dummies) right before the coefficients dict is built, in all three
estimators. This catches data-dependent collisions (FE-dummy vs structural,
FE-dummy vs FE-dummy) that cannot be known up front. The upfront covariate guard
is kept for its precise, fail-fast messaging on the common case.

Tests: tests/test_utils.py::TestValidateDesignTermNames + an MPD fixed-effect-
dummy-vs-period_{p} regression in TestCovariateNameCollision. Full estimator +
TWFE-methodology suites pass with the backstop (no false positives on existing
fixed-effect fits).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
… (perf)

The collision guard reserved fixed-effect/unit/time dummy names via
pd.get_dummies(col, ...).columns, which materializes the full (n x G) dummy
matrix. In TwoWayFixedEffects this ran UNCONDITIONALLY — including the default
within-transform (hc1/classical/conley) path whose entire purpose is to avoid
expanding high-cardinality FE dummies — so a large-G panel paid an avoidable
O(n x G) allocation (latency / potential OOM) before estimation (caught in
local review).

Add utils.fe_dummy_names(col, prefix): derives the same names get_dummies would
produce, from the category levels only (O(G)), via pd.Categorical(col).categories
(sorted unique for a plain column, declared order for a Categorical) + drop_first.
Verified to match pd.get_dummies(drop_first=True).columns exactly across int /
str / float / NaN / non-default-order Categorical dtypes. Wire it into the DiD,
MPD, and TWFE reserved-name construction in place of get_dummies. The within-
transform TWFE path now performs no FE-dummy materialization at all.

Tests: tests/test_utils.py::TestFeDummyNames (equivalence to get_dummies,
incl. Categorical order) + a TwoWayFixedEffects within-path regression that
monkeypatches pd.get_dummies to raise and asserts the hc1 fit still succeeds.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
validate_design_term_names ran just before coef_dict construction — i.e. AFTER
the OLS fit — in DifferenceInDifferences and MultiPeriodDiD, so a duplicate
assembled term name (e.g. an MPD fixed-effect dummy colliding with a structural
period_{p} key) drove a wasted rank-deficient fit and emitted a misleading
multicollinearity warning before the intended ValueError (local review P3).
Move the check to immediately after var_names is fully assembled and before the
regression call in both estimators (TwoWayFixedEffects already checked pre-fit).
var_names is not mutated between assembly and coef_dict, so this is behavior-
preserving except that the ValueError now fires fast and warning-free.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@github-actions
Copy link
Copy Markdown

github-actions Bot commented Jun 1, 2026

Overall Assessment

✅ Looks good. No unmitigated P0/P1 findings.

Executive Summary

  • The PR affects DifferenceInDifferences, MultiPeriodDiD, and TwoWayFixedEffects, but it does not change estimator equations, identification, weighting, or variance/SE formulas.
  • The methodology-facing behavior change is documented in docs/methodology/REGISTRY.md, so I do not see an undocumented source-material deviation.
  • The new validate_covariate_names() plus validate_design_term_names() backstop closes the actual silent last-write-wins overwrite surface before coef_dict construction in the modified estimators.
  • Added tests cover the main collision cases across DiD/MPD/TWFE, formula parsing, fixed-effect-dummy collisions, and the TWFE “do not materialize FE dummies on the within path” invariant.
  • I found one P3 performance concern on TWFE’s default within-transform path; it is not blocking.
  • I could not run pytest here because this environment does not have pytest or the project dependencies installed; review is based on source inspection plus syntax-only compilation of the changed Python files.

Methodology

Code Quality

Performance

  • Severity: P3. Impact: TwoWayFixedEffects.fit() now always synthesizes all unit/time FE dummy names via fe_dummy_names() on the default within-transform branch, even though only _treatment_post is a true collision surface on that branch. This is documented behavior, so it is not blocking, but it adds avoidable O(G) string/set work to the high-cardinality path the branch is meant to protect. Concrete fix: make the reserved-name set branch-specific in TWFE: keep FE-dummy reservation inside the full-dummy use_full_dummy branch, and reserve only names that can actually collide on the within-transform branch. References: diff_diff/twfe.py:L301-L317, diff_diff/utils.py:L185-L217

Maintainability

  • Severity: none. Impact: No new PR-scoped maintainability issue beyond the non-blocking TWFE performance note above. Concrete fix: none.

Tech Debt

  • Severity: none. Impact: I did not identify new untracked deferred correctness work; nothing in TODO.md appears to mitigate or contradict the changed behavior here. Concrete fix: none.

Security

  • Severity: none. Impact: No PR-scoped security issue or secret exposure found. Concrete fix: none.

Documentation/Tests

  • Severity: none. Impact: The documentation updates are aligned with the code, and the new tests are well targeted to the changed surfaces: utility helpers, DiD/MPD collisions, TWFE path coverage, and the TWFE no-get_dummies() regression. Concrete fix: none. References: tests/test_utils.py:L1310-L1388, tests/test_estimators.py:L3560-L3723, tests/test_estimators_vcov_type.py:L2156-L2227

  • Review limitation: I could not execute the test suite locally because pytest and the project’s runtime dependencies are unavailable in this environment. I did verify that the modified Python files parse successfully via compile(...).

@igerber igerber added the ready-for-ci Triggers CI test workflows label Jun 1, 2026
@igerber igerber merged commit 3f3b95a into main Jun 1, 2026
33 of 34 checks passed
@igerber igerber deleted the feature/covariate-name-collision branch June 1, 2026 16:06
@igerber igerber mentioned this pull request Jun 2, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

ready-for-ci Triggers CI test workflows

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant